Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning
Large language models (LLMs) can perform in-context learning (ICL) by
conditioning on a few demonstrations of a new downstream task. However, this
learning paradigm suffers from high instability: the input distribution of
the selected examples, their ordering, and the prompt format all induce
substantial variance. In this work, we demonstrate that even when all of
these factors are held constant, the random selection of examples still
results in high variance. Consequently, we explore the informativeness of
candidate examples by quantifying the Information Gain (IG) in the prediction
after observing a given candidate, and we propose to sample the candidates
with maximum IG. Additionally, we identify a template bias that can lead to
unfair evaluations of IG during sampling. To mitigate this bias, we introduce
a Calibration Before Sampling strategy. Experimental results show that our
method yields an average relative improvement of 14.3% across six
classification tasks using three LLMs. Comment: Accepted to the Findings of EMNLP 202
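The IG criterion the abstract describes can be sketched as follows. This is a toy illustration, not the paper's implementation: the label distributions below are made up, standing in for the probabilities an LLM would assign with and without a candidate demonstration in the prompt.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete label distribution."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + 1e-12))

def information_gain(prior, posterior):
    # IG = reduction in predictive entropy after observing a candidate example
    return entropy(prior) - entropy(posterior)

# Hypothetical label distributions for one validation query:
prior = [0.4, 0.35, 0.25]            # prompt without the candidate demonstration
posteriors = {                       # prompt including each candidate demonstration
    "demo_a": [0.7, 0.2, 0.1],
    "demo_b": [0.34, 0.33, 0.33],
}
best = max(posteriors, key=lambda k: information_gain(prior, posteriors[k]))
print(best)  # "demo_a": it yields the larger entropy reduction
```

A near-uniform posterior ("demo_b") can even have negative IG, which is why sampling by maximum IG prefers demonstrations that make the model more decisive.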
Dual Node and Edge Fairness-Aware Graph Partition
Fair graph partition of social networks is a crucial step toward ensuring
fair and non-discriminatory treatment in unsupervised user analysis. Current
fair partition methods typically consider node balance, a notion that pursues
a proportionally balanced number of nodes from all demographic groups in each
cluster, but they ignore the bias induced by imbalanced edges within
clusters. To address this gap, we propose a notion of edge balance that
measures the proportion of edges connecting different demographic groups in
each cluster. We analyze the relationship between node balance and edge
balance, and then, using line graph transformations, we propose a
co-embedding framework to learn dual node- and edge-fairness-aware
representations for graph partition. We validate our framework on several
social network datasets and observe balanced partitions in terms of both
nodes and edges, along with good utility. Moreover, we demonstrate that our
fair partitions can be used as pseudo-labels to help graph neural networks
behave fairly in node classification and link prediction tasks.
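The two notions can be made concrete with simple metrics on a toy graph. The exact formulas below are illustrative guesses, not the paper's definitions: node balance as the worst-case min/max group ratio within a cluster, and edge balance as the share of intra-cluster edges that cross demographic groups.

```python
import numpy as np

def node_balance(labels, groups):
    """Worst-case min/max ratio of demographic group counts per cluster (1.0 = balanced)."""
    scores = []
    for c in set(labels):
        members = [g for l, g in zip(labels, groups) if l == c]
        counts = np.bincount(members, minlength=max(groups) + 1)
        scores.append(counts.min() / counts.max())
    return min(scores)

def edge_balance(edges, labels, groups):
    """Within each cluster, share of edges connecting different demographic groups."""
    shares = []
    for c in set(labels):
        intra = [(u, v) for u, v in edges if labels[u] == c and labels[v] == c]
        if not intra:
            continue
        cross = sum(groups[u] != groups[v] for u, v in intra)
        shares.append(cross / len(intra))
    return shares

# Toy graph: 4 nodes, two clusters, two demographic groups
edges = [(0, 1), (2, 3), (0, 2)]
labels = [0, 0, 1, 1]   # cluster assignment per node
groups = [0, 1, 0, 1]   # demographic group per node
print(node_balance(labels, groups))         # each cluster has one node per group
print(edge_balance(edges, labels, groups))  # every intra-cluster edge crosses groups
```

A partition can score perfectly on node balance while edge balance is poor (e.g., all intra-cluster edges stay within one group), which is the gap the abstract targets.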
Mining Label Distribution Drift in Unsupervised Domain Adaptation
Unsupervised domain adaptation aims to transfer task knowledge from a labeled
source domain to a related yet unlabeled target domain, and it has attracted
extensive interest from academia and industry. Although tremendous effort has
been devoted to minimizing domain divergence, most existing methods address
only part of the picture by aligning feature representations across domains.
Beyond the discrepancy in feature space, the gap between the known source
label distribution and the unknown target label distribution, recognized as
label distribution drift, is another crucial factor raising domain
divergence, and it has not been sufficiently explored. From this point, we
first experimentally reveal how label distribution drift degrades current
domain adaptation methods. Next, we propose the Label distribution Matching
Domain Adversarial Network (LMDAN) to handle data distribution shift and
label distribution drift jointly. In LMDAN, label distribution drift is
addressed by a source-sample weighting strategy, which selects samples that
contribute to positive adaptation and avoids the negative effects caused by
mismatched label distributions. Finally, unlike standard domain adaptation
experiments, we modify domain adaptation datasets to create considerable
label distribution drift between source and target domains. Numerical results
and empirical model analysis show that LMDAN outperforms state-of-the-art
domain adaptation methods under such scenarios.
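The core idea of weighting source samples by label-distribution mismatch can be illustrated with class-prior importance ratios. The priors below are made up, and in a real unsupervised setting the target prior is unknown and must be estimated, which is what LMDAN's adversarial matching addresses; this is only a conceptual stand-in.

```python
import numpy as np

p_source = np.array([0.6, 0.3, 0.1])   # known source label distribution
p_target = np.array([0.2, 0.3, 0.5])   # (hypothetical) target label distribution

# Importance ratio per class: down-weight classes over-represented in the
# source relative to the target, up-weight under-represented ones.
weights = p_target / p_source

source_labels = np.array([0, 0, 1, 2])
sample_weights = weights[source_labels]  # per-sample weight, indexed by class
print(sample_weights)  # class 0 samples get ~0.33, class 2 samples get 5.0
```

Samples from classes that barely appear in the target are down-weighted so they cannot drive the feature alignment toward a mismatched decision boundary.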
Tactical Trajectory Planning for Stealth Unmanned Aerial Vehicle to Win the Radar Game
In this paper, the problem of planning a tactical trajectory for a stealth unmanned aerial vehicle (UAV) to win the radar game is studied. Three principles for winning the radar game are presented, and their use by a stealth UAV to evade radar tracking is analysed. The problem is formulated by integrating the stealth UAV model, the radar detection constraints, and the multiple objectives of the game. A pseudospectral multi-phase optimal-control-based trajectory planning algorithm is developed to solve the formulated problem; the pseudospectral method is employed to seek the optimal solution with satisfactory convergence speed. Experimental results show that the proposed method is feasible and effective: by following the planned trajectory, with several switches between exposure and stealth, the stealth UAV can win the radar game.
Defence Science Journal, 2012, 62(6), pp. 375-381, DOI: http://dx.doi.org/10.14429/dsj.62.268
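A crude discretized stand-in for this kind of trajectory optimization, far simpler than the paper's pseudospectral multi-phase solver, is gradient descent on a waypoint path that trades path length against a penalty for entering a radar detection circle. The radar position, radius, weights, and step size below are all made-up numbers.

```python
import numpy as np

radar, R = np.array([5.0, 0.0]), 2.0
pts = np.linspace([0.0, 0.0], [10.0, 0.0], 21)   # straight-line initial guess
pts[1:-1, 1] = 0.05                              # nudge off the radar centre line

lr = 0.05
for _ in range(500):
    grad = np.zeros_like(pts)
    # path-length term: pull each interior waypoint toward its neighbours
    grad[1:-1] += 2 * (2 * pts[1:-1] - pts[:-2] - pts[2:])
    # radar-exposure penalty: push waypoints out of the detection circle
    d = pts - radar
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    inside = (dist < R).flatten()
    grad[inside] -= 4.0 * (R - dist[inside]) * d[inside] / dist[inside]
    grad[0] = grad[-1] = 0.0                     # endpoints are fixed
    pts -= lr * grad

print(np.linalg.norm(pts - radar, axis=1).min())  # the path now skirts the circle
```

The actual problem adds vehicle dynamics, aspect-dependent radar cross-section, and multiple phases, which is why a pseudospectral collocation method is used instead of this naive penalty scheme.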
Knowledge Reused Outlier Detection
Tremendous effort has been invested in unsupervised outlier detection research, which is conducted on unlabeled data sets under abnormality assumptions. With abundant related labeled data available as auxiliary information, we consider transferring knowledge from labeled source data to facilitate unsupervised outlier detection on a target data set. To make full use of the source knowledge, the source and target data are combined for joint clustering and outlier detection, using the source data's cluster structure as a constraint. To achieve this, a categorical utility function is employed to regularize the partition of the target data to be consistent with the source data labels. With an augmented matrix, the problem is solved completely by a K-means-based method with a rigorous mathematical formulation and a theoretical convergence guarantee. We used four real-world data sets and eight outlier detection methods of different kinds for extensive experiments and comparison. The results demonstrate the effectiveness and significant improvements of the proposed method in terms of outlier detection and cluster validity metrics. Moreover, a parameter analysis is provided as a practical guide, and a noisy-source-label analysis shows that the proposed method can handle real applications where source labels may be noisy.
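The augmented-matrix idea can be sketched in a few lines: append scaled one-hot source labels as extra columns (zeros for target rows) so that K-means is pulled toward partitions consistent with the source labels, then score target points by distance to their assigned center. This is a simplified stand-in for the paper's categorical-utility regularization; the scale `lam` and the K-means details are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Labeled source data (two classes) plus unlabeled target data with one outlier.
Xs = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
ys = np.repeat([0, 1], 20)
Xt = np.vstack([rng.normal(0, 0.3, (15, 2)), rng.normal(3, 0.3, (15, 2)),
                [[10.0, -10.0]]])                 # last target point is the outlier

lam = 2.0                                         # strength of the label constraint
aug = np.vstack([np.eye(2)[ys] * lam, np.zeros((len(Xt), 2))])
X = np.hstack([np.vstack([Xs, Xt]), aug])         # the augmented matrix

# K-means initialized at the source class means, so labels guide the partition.
centers = np.stack([X[:20].mean(0), X[20:40].mean(0)])
for _ in range(20):
    assign = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.stack([X[assign == c].mean(0) for c in (0, 1)])

# Outlier score: distance of each target point to its assigned cluster center.
d = np.linalg.norm(X[40:] - centers[assign[40:]], axis=1)
print(int(np.argmax(d)))  # the injected far-away target point scores highest
```

Because the one-hot columns keep source rows of the same class together, the joint clustering cannot drift away from the source structure, and target points that fit neither constrained cluster stand out as outliers.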
Affine Transformation Edited and Refined Deep Neural Network for Quantitative Susceptibility Mapping
Deep neural networks have demonstrated great potential in solving dipole
inversion for Quantitative Susceptibility Mapping (QSM). However, the
performance of most existing deep learning methods degrades drastically with
mismatched sequence parameters such as acquisition orientation and spatial
resolution. We propose an end-to-end AFfine Transformation Edited and Refined
(AFTER) deep neural network for QSM, which is robust against arbitrary
acquisition orientation and spatial resolution up to 0.6 mm isotropic at the
finest. The AFTER-QSM neural network starts with a forward affine
transformation layer, followed by an Unet for dipole inversion, then an inverse
affine transformation layer, followed by a Residual Dense Network (RDN) for QSM
refinement. Simulation and in-vivo experiments demonstrated that the proposed
AFTER-QSM network architecture had excellent generalizability. It can
successfully reconstruct susceptibility maps from highly oblique and
anisotropic scans, leading to the best image quality assessments in simulation
tests and suppressed streaking artifacts and noise levels for in-vivo
experiments compared with other methods. Furthermore, ablation studies showed
that the RDN refinement network significantly reduced image blurring and
susceptibility underestimation due to affine transformations. In addition, the
AFTER-QSM network substantially shortened the reconstruction time from
minutes with conventional methods to only a few seconds.
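The ordering that makes AFTER-QSM robust, resample into a canonical frame, run the core model there, then map the result back, can be sketched abstractly. Here a 90-degree rotation stands in for the affine layers and a box blur for the Unet dipole inversion; both are illustrative placeholders, not the network itself.

```python
import numpy as np

def forward_affine(x):   # acquisition frame -> canonical frame
    return np.rot90(x)

def inverse_affine(x):   # canonical frame -> acquisition frame
    return np.rot90(x, k=-1)

def core_model(x):       # placeholder for the dipole-inversion Unet
    return (x + np.roll(x, 1, 0) + np.roll(x, -1, 0)) / 3

field = np.arange(16.0).reshape(4, 4)   # toy stand-in for a field map
out = inverse_affine(core_model(forward_affine(field)))
print(out.shape == field.shape)         # output is back in the input frame
```

Because the core model only ever sees canonically oriented input, it does not need to generalize across acquisition orientations; the final refinement stage (the RDN in the paper) then compensates for interpolation blurring introduced by the transforms.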
Characterizing the Influence of Graph Elements
The influence function, a method from robust statistics, measures how model
parameters, or functions of them, change when training instances are removed
or modified. It is an efficient and useful
post-hoc method for studying the interpretability of machine learning models
without the need for expensive model re-training. Recently, graph convolution
networks (GCNs), which operate on graph data, have attracted a great deal of
attention. However, there is no preceding research on the influence functions
of GCNs to shed light on the effects of removing training nodes/edges from an
input graph. Since the nodes/edges in a graph are interdependent in GCNs, it is
challenging to derive influence functions for GCNs. To fill this gap, we
started with the simple graph convolution (SGC) model that operates on an
attributed graph and formulated an influence function to approximate the
changes in model parameters when a node or an edge is removed from an
attributed graph. Moreover, we theoretically analyzed the error bound of the
estimated influence of removing an edge. We experimentally validated the
accuracy and effectiveness of our influence estimation function. In addition,
we showed that the influence function of an SGC model could be used to estimate
the impact of removing training nodes/edges on the test performance of the SGC
without re-training the model. Finally, we demonstrated how influence
functions can be used to guide adversarial attacks on GCNs effectively.
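The classical influence-function machinery the abstract builds on can be demonstrated on a model simpler than an SGC: for ridge regression, the first-order estimate of the leave-one-out parameter change is the inverse Hessian applied to the removed point's loss gradient, and it can be checked against exact retraining. All data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
lam = 1.0

def fit(X, y):
    """Ridge regression: minimize 0.5*||X w - y||^2 + 0.5*lam*||w||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

theta = fit(X, y)
H = X.T @ X + lam * np.eye(3)                 # Hessian of the ridge objective
i = 7
grad_i = X[i] * (X[i] @ theta - y[i])         # gradient of point i's loss at theta
est = theta + np.linalg.solve(H, grad_i)      # first-order influence estimate

exact = fit(np.delete(X, i, 0), np.delete(y, i))
print(np.abs(est - exact).max())              # small: the estimate tracks retraining
```

For graph models the same recipe is harder because removing one node or edge perturbs the loss terms of its neighbours as well, which is the interdependence the paper's derivation for SGC has to handle.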
Marginalized Latent Semantic Encoder for Zero-Shot Learning
Zero-shot learning has been extensively explored as a way to identify new, unobserved classes through a visual-semantic function learned from existing objects. However, two challenging obstacles remain: human-annotated semantics are insufficient to fully describe the visual samples, and there is a domain shift between existing and new classes. In this paper, we exploit the intrinsic relationships in the semantic manifold when the given semantics are insufficient to describe the visual objects, and we enhance the generalization ability of the visual-semantic function with a marginalization strategy. Specifically, we design a Marginalized Latent Semantic Encoder (MLSE), which is learned on augmented seen visual features and a latent semantic representation. Meanwhile, latent semantics are discovered through an adaptive graph reconstruction scheme based on the provided semantics. Consequently, our algorithm enriches visual characteristics from seen classes and generalizes well to unobserved classes. Experimental results on zero-shot benchmarks demonstrate that the proposed model outperforms state-of-the-art zero-shot learning approaches.
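The visual-semantic function at the heart of this setup can be sketched with a plain ridge mapping, which is a generic zero-shot baseline rather than the MLSE model itself: learn a map from visual features to attribute space on seen classes, then label an unseen-class sample by its nearest class attribute vector. The class names and attribute vectors below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
attrs = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "fox": [0.7, 0.7]}  # toy semantics

def sample(cls, n=30):
    """Toy visual features: clusters positioned by each class's attributes."""
    return np.array(attrs[cls]) * 5 + rng.normal(0, 0.3, (n, 2))

Xs = np.vstack([sample("cat"), sample("dog")])            # seen-class features
S = np.repeat([attrs["cat"], attrs["dog"]], 30, axis=0)   # their attribute targets
W = np.linalg.solve(Xs.T @ Xs + 0.1 * np.eye(2), Xs.T @ S)  # ridge visual->semantic map

x_unseen = sample("fox", 1)                               # "fox" was never trained on
pred = min(attrs, key=lambda c: np.linalg.norm(x_unseen @ W - attrs[c]))
print(pred)  # fox
```

The two obstacles in the abstract show up directly in this sketch: if the attribute vectors are too coarse to separate classes, nearest-attribute matching fails, and if unseen-class features fall off the manifold spanned by seen classes, the learned map projects them poorly, which is what MLSE's latent semantics and marginalization are designed to mitigate.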